Waveform similarity based overlap-add (WSOLA) for time-scale modification of speech: structures and evaluation
نویسندگان
چکیده
A synchronization criterion for overlap-add time-scale modification is derived through a least squares estimation of the modified short-time Fourier transform. Based on this finding, a structural time-domain framework for time-scale modification is described. One efficient variant, which was called the Waveform Similarity based Overlap-Add (WSOLA) method, produces high quality output when applied to speech, but can even be applied successfully to a broader class of signals, including multiple voices together and musical instruments. Fine-tuning the synchronization criterion, without affecting the high quality that is obtained, can make the computational cost very low, revealing the versatile possibilities for on-line operation.
منابع مشابه
Bilateral Waveform Similarity Overlap-and-Add Based Packet Loss Concealment for Voice over IP
This paper invested a bilateral waveform similarity overlap-and-add algorithm for voice packet lost. Since Packet lost will cause the semantic misunderstanding, it has become one of the most essential problems in speech communication. This investment is based on waveform similarity measure using overlap-and-Add algorithm and provides the bilateral information to enhance the speech signal recons...
متن کاملAn Overlap-add Technique Based on Waveform Similarity (wsola) for High Quality Time-scale Modification of Speech
A concept of waveform similarity is proposed for tackling the problem of time-scale modification of speech, and is worked-out in the context of short-time Fourier transform representations. The resulting WSOLA algorithm produces high quality speech output, is algorithmically and computationally efficient and robust, and allows for on-line processing with arbitrary timescaling factors that may b...
متن کاملOverlap-add methods for time-scaling of speech
In this tutorial on time scaling we follow one particular line of thought towards computationally efficient high quality methods. We favor time scaling based on time-frequency representations over model based approaches, and proceed to review an iterative phase reconstruction method for time-scaled magnitude spectrograms. The search for a good initial phase estimate leads us to consider synchro...
متن کاملEpoch-Synchronous Overlap-Add (ESOLA) for Time- and Pitch-Scale Modification of Speech Signals
Timeand pitch-scale modifications of speech signals find important applications in speech synthesis, playback systems, voice conversion, learning/hearing aids, etc.. There is a requirement for computationally efficient and real-time implementable algorithms. In this paper, we propose a high quality and computationally efficient timeand pitch-scaling methodology based on the glottal closure inst...
متن کاملNovel speech duration modifier for packet based communication system
In this paper, we propose a real-time method for duration modification of speech for packet based communication system. While there is rich literature available on duration modification, it fails to clearly address the issues in real-time implementation of the same. Most of the duration modification methods rely on accurate estimation of pitch marks, which is not feasible in a real-time scenari...
متن کامل